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Abstract 

A consensus linkage map of Picea abies, an economically Important conifer, was constructed based on the segregation of 
686 SNP markers in a progeny population consisting of 247 individuals. The total length of 1889.2 cM covered 96.5% of 
the estimated genome length and comprised 12 large linkage groups, corresponding to the number of haploid P. abies 
chromosomes. The sizes of the groups (from 5.9 to 9.9% of the total map length) correlated well with previous estimates of 
chromosome sizes (from 5.8 to 10.8% of total genome size). Any locus in the genome has a 97% probability to be within 10 
cIVl from a mapped marker, which makes the map suited for QTL mapping. Infecting the progeny trees with the root rot 
pathogen Heterobasidion parviporum allowed for mapping of four different resistance traits: lesion length at the inoculation 
site, fungal spread within the sapwood, exclusion of the pathogen from the host after initial infection, and ability to prevent 
the infection from establishing at all. These four traits were associated with two, four, four and three QTL regions 
respectively of which none overlapped between the traits. Each QTL explained between 4.6 and 10.1% of the respective 
traits phenotypic variation. Although the QTL regions contain many more genes than the ones represented by the SNP 
markers, at least four markers within the confidence intervals originated from genes with known function in conifer defence; 
a leucoanthocyanidine reductase, which has previously been shown to upregulate during H. parviporum infection, and three 
intermediates of the lignification process; a hydroxycinnamoyi CoA shikimate/quinate hydroxycinnamoyltransferase, a 4- 
coumarate CoA ligase, and a R2R3-MYB transcription factor. 
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Introduction 

Norway spruce \Picea abies (L.) Karst.] is ecologically one of the 
most important conifer species in Europe, naturally present in 
central Europe across the Alps, the Carpathians and the Balkans, 
and in northern Europe from the west coast of Norway to far into 
the Russian mainland [1]. It is also one of the most important 
species economically in European and Swedish forestry, consti- 
tuting 41% of the standing Swedish tree volume [2]. This 
dominance means that vast areas of forest land are largely 
monocultural, covered by only P. abies. Albeit favouring a single 
species has been a sound strategy in terms of generating forest 
revenue, this approach has also been extremely beneficial to the 
forest pathogen Heterobasidion parviporum, which is a highly 
competitive early colonizer of fresh wounds and newly cut stumps. 
H. parviporum is a causal agent of annosum root rot, a serious and 
very common fungal disease in conifer forests of the Northern 



Hemisphere [3] causing yearly losses to European forest owners 
exceeding € 790 million yearly in growth reduction and 
devaluation of timber [4]. In 1986, the average incidence of root 
rot in Swedish Norway spruce trees was estimated to 15% [5], and 
the frequency has later been reported to increase in managed 
forests with ~23% per decade [6]. 

Trees less susceptible to root rot would be highly coveted by the 
forest industry. Nevertheless, most breeding endeavours so far 
have been directed at high tree growth, with little or no effort 
targeted at breeding for trees resistant to H. parviporum infection. 
Still, resistance breeding is potentially fruitful, as it has been shown 
that H. parviporum resistance is a genetically variable trait and 
does not affect growth rate negatively [7]. It would be an 
important step towards efficient resistance breeding to have 
molecular resistance-associated markers as well as better knowl- 
edge of whether resistance is a multifactorial trait or instead, 
mainly controlled by a single locus. One way to shed light on this is 
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through quantitative trait loci (QTL) mapping of P. abies 
resistance to H. parviporum infection. 

Mapping of QTLs in genomes such as those of conifers of the 
Pinaceae family is fraught with various problems. The huge size of 
the Pinaceae genomes alone (~ 14-37 Gb) [8-9] has its drawbacks, 
but the extremely low gene density in combination with great 
regions of repeatable elements with low recombination rate [10] is 
an even greater obstacle in terms of converting identified QTLs to 
underlying sequence variation. In P. abies, the haploid genome is 
estimated to 19.6 Gb and the number of transcribed genes to 
28,354 [11], making the average distance between two neigh- 
bouring genes 0.69 Mb. The proportion of various repetitive 
elements in the genome has been estimated to 70% [1 1], making 
whole genome assembly a difficult task where very few, if indeed 
any, scaffolds are likely to be large enough to contain more than a 
single gene. This makes it very hard to know the genome sequence 
between two given P. abies loci. In terms of making sense out of a 
QTL, this means that only markers originating from a known 
position are of potential use; anonymous markers are almost 
impossible to translate unless they originate from the same scaffold 
as a sequence-tagged marker. It also means that if several markers 
within a QTL are significandy associated with the trait, only the 
ones originating from the gene actually controlling the trait wiU 
carry any immediately useful information. This is a significant 
limitation compared to QTLs in maps based on less fragmented 
genome sequences since the search for potential candidate genes is 
more likely to be rewarding if the entire QTL is located to a single, 
large scaffold. On the other hand, the low gene density also means 
that there will be a limited number of possible candidate genes in 
the vicinity of each marker associated with the trait. Despite these 
potential obstacles, QTL analyses have been applied previously to 
map host resistance in similar pathosystems. For example, Pinus 
radiata resistance to Dothistroma needle blight has been found to 
be controlled by multiple QTLs [12], and two major QTLs 
explained b2% of the variance in Eucalyptus globulus resistance to 
Mycosphaerella cryptica infection [13]. Furthermore, Eucalyptus 
inter-specific hybrids have been used to show a multifactorial 
control of resistance to Puccinia psidii rust infection [14], and rust 
resistance in both Populus and Salix spp. is based on several loci 
[1,5-16]. 

Several linkage maps of P. abies have been constructed already, 
but only one [17-18] has been saturated enough to create a 
number of linkage groups corresponding to the haploid number of 
chromosomes. However, of the 768 markers in that map, only 32 
were derived from ESTPs and thus originating from known genes, 
crucial for making use of QTLs in fragmented genome assemblies. 
Another Picea map, based on a Picea mariana x Picea ruhens 
hybrid population, consisted of 835 positioned markers, of which 
318 originated from known genes [19]. In terms of mapped genes, 
the most comprehensive conifer maps to date has been constructed 
for Picea glauca [20], containing 1743 SNP markers from 
transc:ribed genes, and t^\ ice for Pinus taeda [21—22], based on 
2841 and 2393 segregating genes. 

The goal of this study was to map resistance QTLs in P. abies 
involved in response to four aspects of//, paniiporum infection. 
The map was constructed using 247 fuU-sib progenies from a cross 
between two P. abies parents, based on the segregation of 686 
polymorphic SNP markers, all derived from known P. abies genes. 
All four traits were controlled by multiple loci, explaining between 
4.6 and 10.1% of the phenotypic variance in the mapping 
population. If the effect of all loci is additive, the variation within 
the four traits would be explained by in total 12-25.7%. These are 
the first resistance QTLs identified against this fungus, and also the 
first identified against any pathogen in P. abies. Naturally, the 



SNPs underlying these QTLs only represent a fraction of the 
actual genes and might not be the ones causal for the traits. 
Nevertheless, the SNPs originated from genes with homologs to at 
least four plausible candidate genes. One of these, a leucoantho- 
cyanidine reductase, has previously been shown to be upregulated 
in P. abies in response to H. parviporum infection [23]. 

Materials and Methods 

The mapping population and resistance assays 

The biological material used to construct the linkage map and 
conduct the resistance assays was a family of originally 251 
progenies (later adjusted to 247), with six ramet cuttings from each 
original ortet, from a cross between the Picea abies parents 
S21K7622162 and S21K7621678. The biological material and 
the virulence assays have been described previously [24]. In short, 
the resistan[:e of the progeny was estimated by infecting four 
ramets of each progeny, as 2-year-old potted plants, with a 
Heterobasidion parviporum (strain Rb 175) infested wooden plug 
through a 5x5 mm cambial wound. The entire assay was 
conducted in the greenhouse. After 4 weeks, death of the inner 
bark (lesion) was noted and stems were cut aseptically into 5-mm 
lengths that were incubated in moist condition. Sapwood growth 
of H. parviporum was assessed from the presence of conidiophores 
on sections of the stem. These data were treated as best hnear 
unbiased prediction (BLUP) values [24] . Resistance was estimated 
by measuring lesion lengths in the bark surrounding the wound, 
fungal growth within sapwood up- and down-stem from the 
wound and fungal exclusion, i.e. the abihty of the host to exclude 
the fungus once it has entered the sapwood. The fungal exclusion 
concept was based on the observation that after 4 weeks infection, 
the fungus was not always continuously present in the sapwood all 
the way from the wound to its farthest frontiine but rather seemed 
to lose substrate to the host, whose ability to reclaim sapwood thus 
can be seen as a resistance trait. Exclusion was calculated as the 
cleansed proportion of the infected sapwood: exclusion = x/y, 
where x is the distance of fungus-free sapwood between the wound 
and the fungus, and y is the total cUstance of sapwood between the 
wound and the fungal front. Both x and y were calculated as mean 
values across all four ramets. 

Arnerup et al [24] observed no fungal material in 2 7 % of the 
infected plants and argued that this might either be due to 
infections failing at an early stage and never reaching the sapwood, 
or due to a later, full exclusion of the infection (x = y). They 
further observed that excluding or including the potentially failed 
infections had littie effect on the overall results. To investigate 
whether these 27% could be genetically explained as the abilit)' of 
the host to stop the fungus from entering the wound altogether, 
this was also mapped as a separate trait. Infection prevention was 
quantified as the proportion of ramets of each progeny that did not 
contain any conicUophores after incubation and had no more than 
2 mm inner bark lesion up- or down-stem from the wound after 4 
weeks of infection. 

DNA extraction and SNP mariner design 

Genomic DNA was extracted from spruce needles of the two 
ramets remaining from each progeny after the resistance assays, 
using the DNeasy Plant Mini Kit (QIAGEN, Germantown, MD). 
An lUumina 3072 SNP Golden Gate Assay was developed by 
merging SNPs from a number of different resequencing and 
genotyping projects. AH markers included in the assay had a score 
higher than 0.6 in the lUumina OPA design. The majority of SNPs 
(1879) were originally identified in and designed for Picea glauca 
and later also tested on and found to be variable in a small number 



PLOS ONE I www.plosone.org 



2 



July 2014 I Volume 9 | Issue 7 | e101049 



Picea abies Linkage Map Locates Resistance QTLs 



of P. abies individuals [25]. The rest of the SNPs came from three 
different sources; 250 from a previously developed 768 SNP 
Golden Gate Assay [26], 269 from sequencing pooled PGR 
products using lUumina next-generation sequencing technology 
(Ma XF, Zaina, G, Kallman T, Chen J, Morgante M, Lascoux M, 
unpubhshed), and 674 SNPs identified in a single individual 
subjected to mRNA sequencing using Dlumina technology [27]. 

Linkage analysis and map construction 

The Illumina Golden Gate assay was used to genotype the 
mapping population according to standard procedures [28] at the 
SNP Technology Platform, Uppsala University (http://molmed. 
medsci.uu.se/SNP+SEQ+Technology+Platform/Genotyping/). 
The SNP markers formed three clearly distinguishable clusters 
(A/A, A/a, a/a), and those polymorphic between the parental 
trees were used as mapping data. Their scgr(;gation patterns 
within the 247 progenies were observed and analysed using 
JoinMap 3.0 (Kyazma) [29] and visueilized using MapChart 2.1 
(Plant Research International) [30]. To be accepted, a marker 
had to be scored as present or absent (i.e., not as missing data) 
in at least 80% of the individuals. Markers with a segregation 
pattern deviating strongly from the expected ratio (P<0.005, x^) 
were also omitted. The segregation data were coded as CP, i.e. 
a population resulting from a cross between two heterogeneously 
heterozygous and homozygous diploid parents. Linkage groups 
were determined using pairwise comparisons at minimum 
likelihood of odds (LOD) value of 4 and a recombination 
frequency threshold 0.4. As no prc\ious information on marker 
order existed, the internal order within groups was accepted as 
presented by the Map 2-function (regression mapping, Kosam- 
bi's mapping function, adding markers one at a time, accepting 
the position that results in the best goodness-of-fit for the map, 
discarding markers that results in a jump in for goodness-of- 
fit of 5 or more), after discarding any markers that contributed 
to the mean %^ for for goodness-of-fit of the group by more than 
4. Finally, to screen for double recombination events, an 
average genotype probability (-LoglO(P)) threshold value of 0.3 
was set for each group, eliminating all markers with a higher 
average. Similarly, markers involved in highly improbable 
genotypes (-LoglO(P) >2.0) in more than 10 of the 247 
individuals were also removed. 

From each gene carrying more than one marker, every marker 
but the best genotyped one (defined by least missing data and least 
distorted segregation) was deemed as redundant. Markers were 
deemed as originating from the same genes as another marker, if 
they shared identical accession numbers in GenBank, TAIR or the 
Picea glauca genome and mapped within 10 cM of each other. 
Even though recombination in P. abies, as in many plants, seem to 
occur mainly within the genes [31-33], the population of the 
present study is not big enough and the SNP markers not 
numerous enough to expect many retxjmbination events between 
these redundant markers. As they are expected to have identical 
genotypes, the discordance can be used as a way of measuring 
genotyping error in the data. The effect of this genotyping error 
was estimated by summing the largest possible distance between 
two markers on every gene with redundant markers and divide by 
the number of such genes. The QTL ranges were subserjuendy 
adjusted at both ends by this mean in order to compensate for the 
genotyping error. 

Genome length was estimated using method 4 of Chakravarti et 
al. [34], in which the observed length in cM of each linkage group 
is multiplied by (m +1)/ (m -1), with m being the marker count for 
each group. Genome coverage was calculated as (total observed 
genome length)/(total estimated genome length), and as the 



proportion c of the genome within d cM of a marker, using the 
formula c = 1 - [35] with L being the estimated genome 

length and n the number of mapped markers. As these methods 
assume normal distribution of data, we tested for this by dividing 
every linkage group into 10 cM fragments and counting the 
number of markers, x, in each, with x ranging from 0 to 15. A 
Poisson distribution function, P(x) — n,''e'''/x!, was used to 
compare this to a normal distribution, with P(x) being the 
probability of x markers per 10 cM fragment and |J, the genome- 
wise average number of markers per fragment. The expected 
distribution of intervals with x markers was calculated by 
multiplying P(x) for each x (0-15) with the total number of 
10 cM intervals in the map. Finally, a Kolmogorov-Smirnov test 
for two populations was conducted to decide whether the observed 
distribution of markers was likely similar to the expected 
distribution [36]. 

QTL analyses 

BLUP values for lesion lengths and fungal growth in sapwood, 
and mean values for excluded proportion of total infection length 
[24], were used in th(; QTL analyses, whereas raw proportion data 
was used for the infection prevention trait. 

The likelihood of resistance QTLs was determined using the 
MapQTL (Kyazma) mixture model method [37] according to the 
following approach. First, interval mapping [38] (200 iterations) 
was employed for every 1 cM throughout the linkage map in order 
to identify markers significandy associated with the trait. The 
significance was calculated by a permutation test (P<0.05, x^) 
(5000 permutation.s). QTL peaks significant at either genome or 
linkage group level were picked for further study. Then, the 
marker closest to the respective regional QTL peak was tentatively 
picked as cofactor and tested using the restricted MQ_M-mapping 
algorithm (as there were multiple putative QTLs for each trait). If 
this altered the position of the QTL peak, a new cofactor was 
picked. This process was repeated until every designated cofactor 
was located as closely as possible to the peak of their respective 
QTL. The confidence interval for each resulting QTL was 
determined as a decrease in 1 LOD unit on both sides of the QTL 
LOD-peak, adjusted at both ends by the mean effect of the 
suspected genotyping error (see above). Finally, a Kruskal-Wallis 
test was conducted to determine the individual level of association 
between the markers within the confidence interval and the 
examined traits. A QTL was only deemed significant if it 
contained at least one significant marker according to the 
Kruskal-Wallis test (P<0.05, marker level). Such markers were 
noted as potentially causative for the trait along with the 
designated cofactor (Table 1). AU calculations were performed 
using MapQTL 4.0 (Kyazma) [37]. 

After the significant QTLs were identified, the false discovery 
rate (FDR) [39] was estimated in order to determine the rate of 
type I errors (i.e. false discoveries). The individual P-values for 
every significant QTL, for the individual traits as well as all pooled 
together, was listed as P(l), P(2) ... P(m), where P(l) was the 
smallest and P(m) the largest. According to the Benjamini- 
Hochberg (BH) procedure [39], the FDR is controlled at the 
specified significance level q if P(i) < q*i/ m, where P(i) is the P- 
value of a given QTL in the list, i is the numerical listed position of 
that QTL (i.e., between 1 and 13 for the pooled traits, since 13 
QTLs were found) and m is the number of QTLs considered. 
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Table 2. Statistics for SNP markers segregating between the parental trees S21K76221 and S21K7621678. 





Heterozygous for both parents 


for S21K7622ie2 


for S21K7621678 


Total 


Polymorphic' 137 


292 


309 


738 


Distorted (0.005<P<0.05) 7 


16 


12 


35 


Positioned 118 


278 


290 


686 


Unlinked 19 


14 


19 


52 


'Remaining after removing heavily distorted loci (P<0.005) or missing data in 


>20% of the mapping populatio 
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Results 

Polymorphic SNP markers segregating in the progeny 

Screening the 247 progenies and the parental trees 
S21K7622I62 and S21K7621678 against the Illumina 3072 
SNP Golden Gate Assay discovered that 1014 assays failed to 
produce useful results. This resulted in 2058 (67%) successfully 
scored SNP markers. Of these, 874 (43%) SNPs proved to be 
polymorphic among the parental genomes and useful as genetic 
markers. The marker names, sequences and corresponding 
accession numbers in GenBank, TAIR and the Picea glauca 
genome are presented in Table SI. Analysing the SNP markers 
revealed that the 874 markers originated from 769 unique 
transcribed genes. Sixty-nine of these genes gave rise to more 
than one SNP marker each. In total, 174 of the 874 SNP markers 
came from such multi-SNP generating genes, i.e. the dataset had 
105 redundant markers. The most widely spaced redundant 
markers of each corresponding gene mapped on average 0.97 cM 
of each other. Since not enough recombination events should 
occur within these genes to justify this figure, redundant markers 
were removed so that each of the 769 transcribed genes were 
represented by only one SNP marker .After further trimming the 
marker set by removing those that were heavily distorted (P< 
0.005, X") or poorly characterized (missing data in >20% of the 
population), 738 markers remained. Of these, 292 SNPs were 
heterozygous in S21K7622162 (segregating 1:1), 309 in 
S21K7621678 (1:1), and 137 were heterozygous in both parents 
(segregating 1:2:1) (Table 2). Thirty-five SNP markers (4.7%) 
segregated at a ratio distorted from the expected 1:2:1 (7 markers) 
or 1:1 ratio (28 markers) (0.005<P<0.05, x\ 

The SNP analysis showed that eight individuals (four pairs) of 
the originally 251 members of the mapping population were 
between 97.5% and 99.3% identical in terms of SNP genotypes. 
As this probably was due to DNA contamination at some stage, 
one member of each pair was omitted, resulting in 247 trees. 

Twelve well supported linkage groups 

JoinMap and MapQTL input files are included as Tables S2 
and S3. 

The markers that proved to be involved in many improbable 
genotypes (average -LoglO(P) for linkage with neighbouring 
markers above 0.3, or -LoglO(P) above 2.0 in more than 10 
individuals of the population) were deemed suspicious and omitted 
during the mapping process. After this final screening, which 
removed 20 markers, JoinMap 3.0 (Kyazma) organized 686 of the 
remaining 7 1 8 into a consensus map of twelve linkage groups 
(Figures 1-3) as per result of the Map 2-function. No marker in the 
resulting linkage groups contributed with more than 4 to the mean 
of goodness-of-fit for the group. Although a minimum LOD 
score of 4 was arbitrarily determined as sufficient to accept a 
linkage group as significant, this threshold was not relevant as all 



twelve groups remained stable up to a LOD score of 10. The 
groups varied in size from 1 10.7 to 186.2 cM and included from 
32 to 73 markers (Figures 1-3, Table 3). The twelve linkage groups 
covered 1889.2 cM in total and had an average density of 
2.8 cM/marker, varying across the groups from 2.2 to 3.5 cM/ 
marker. The genome wide physical size to genomic distance ratio 
was 10.4 MB/cM, if assuming the Norway spruce genome to be 
19.6 GB large [11]. Across the groups, assuming each chromo- 
some is 1.63 GB large, the ratio varied between 8.8 and 14.7 MB/ 
cM (Table 3). 

Genome length was estimated as 1958 cM, using the (m +l)/(m 
- 1) method [34]. The map thus covers 96.5% of the estimated 
genome and has a 97.0% probability that any given locus in the 
genome lies at most 1 0 cM from a positioned marker (determined 
from c — see Material and Methods). Whether the observed 
distribution of markers across 10 cM blocks of the genome was 
different from the expected random distribution was tested with a 
Kolmogorov-Smirnov test. The difference was small (D = 0.08) 
and non significant (P — 0.53). The distributions are displayed in 
Figure 4. 

Of the 35 markers displaying distorted segregation, 3 were 
unlinked while 32 (91.4%) were distributed over 9 of the 12 the 
linkage groups (Table 3). Six groups had just 1 or 2 distorted 
markers, while the other three groups (1,2 and 6) had 7, 12 and 5 
markers, respectively. In three cases two or more neighbouring 
distorted markers constitute a cluster. On LG 1, five distorted 
markers were positioned between 20.4 and 23.3 cM, whereas on 
LG 2, three others were clustered between 84.9 and 87.4 cM. Two 
distorted neighbouring markers were found on LG 6 (at 35.1 and 
36.9 cM), while the other 14 on these groups were distributed 
without any other such marker as their closest neighbour. 

Resistance variation detected in the population 

Two-hundred and fifty-one full sibling plants stemming from a 
cross between the P. abies parents S21K7622162 and 
S21K7621678 were used for scoring resistance against H. 
parviporum infection. The resistance variation among the 
progeny, expressed as lesion lengths around the wound and fungal 
growth within the sapwood, has been described previously [24] . In 
addition, the exclusion of fungus from the host was calculated as 
the proportion of fungal growth in sapwood eliminated from the 
wound and outward, and the ability to prevent infection was 
calculated as the proportion of ramets completely lacking 
conidiophores after 4 weeks of infection. The distributions of the 
resistance over the progeny for all traits are displayed in Figure 5. 

Resistance mapping reveal multiple QTLs 

As visible from Figures 1-3 and Table 1, the traits exclusion 
(purple in Figures 1-3), lesion length (green), growth in sapwood 
(red) and infection prevention (orange), were associated with 
four, two, four, and three specific QTL regions, as per defined by a 
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Figure 1. Linkage groups 1-4 of the Picea abies genome and the QTLs for various resistance traits. Names of the SNP markers are 
displayed on the left of the linkage groups. Genetic distance (cM) is indicated on the right. SNP markers are described in detail in Table SI. Graphs 
denote QTL effects for the current group. Red curves indicate fungal growth within sapwood, purple fungal exclusion and orange curves indicates 
infection prevention. Complete and dashed vertical lines describe 0.1% and 5% levels of significance for the individual trait and group. Wide colored 
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areas between curve and group show the QTL confidence interval based on a 1 LOD drop from the QTL peak. The colored marker denotes SNPs 
under the QTL confidence interval, the stars level of significance according to the Kruskal-Wallis test (ranging from p<0.1 (*) to p<0.0005 (******) and 
bold style the designated cofactors. "D" after a marker name indicates segregation pattern deviating from the expected Mendelian ratios of 1 :1 or 
1:2:1 (0.005<P<0.05, x^). Image created using MapChart 2.1 (Plant Research International). [28] 
dor:1 0.1 371 /journal.pone.01 01 049.g001 



1-LOD interval around a significant QTL LOD-peak (permuta- 
tion test, P<0.05, x^), containing at least one marker significant at 
P<0.05, according to the Kruskal-Wallis test, and adjusted at 
both ends by 0.97 cM in order to compensate for genotyping 
error. The QTL regions contained nine, four, seven, and seven 
such SNP markers, respectively. The QTLs for each trait 
explained similar percentages of the phenotypic variation; 4.8— 
10.1% for die exclusion (25.7% in total, if additive), 5.3-6.7% for 
the lesion lengths (12% in total), 4.6-8.4% for the growth in 
sapwood (25.1% in total), and 5.3-7.2% for the infection 
prevention (18.4'yo in total), while the significant (P<0.05, % ) 
LOD peaks varied between 3.31 and 5.79 across the traits and 
QTLs. Three QTLs (1 for lesion lengths, 2 for growth in sapwood) 
were significant at the genome level; the others only at the 
chromosome level. The genes corresponding to each significant 
SNP and the corresponding cofactor can be seen in Table 1. 

The BH-procedure [39] showed that for the QTLs from 
individual traits as well as for all QTLs in the study together, the 
false discovery rate was controlled at q = 0.05, since P(i)<q*i/m 
for every i (QTL) at that level of significance (see Material and 
Methods). 

Among these resulting 27 SNP markers, we identified four 
markers in genes with homology (BlastX, MaxScore >378, E- 
value < le-125) to Pinaceae genes with a suggested role in defense 
response: a hydroxycinnamoyl CoA shikimate/quinate hydroxy- 
cinnamoyltransferase [40], a 4-coumarate CoA-hgase [41], a 
R2R3-MYB transcription factor [42], a leucoanthocyanidine 
reductase [43], (Table 1). These are specifically involved in the 
phenylpropanoid and flavanoid pathways. 

Discussion 

One of the purposes of QTL mapping is to investigate the 
number of involved loci in complex traits and their individual 
importance. Another coveted benefit of QTL mapping is the 
identification of candidate genes behind these traits, which makes 
it essential to be able to decipher the mapped area into actual 
sequence. Because of the diflBculties in assembling conifer DNA 
into scaffolds containing multiple genes, the number of known 
neighbouring genes to any given identified gene in the Picea abies 
genome is virtually zero. Genetic linkage can circum\'ent this 
obstacle, but only if the markers are fashioned from known 
sequence, i.e. not anonymous. An example of such markers is 
SNPs located in transcribed genes. Since the P. abies whole 
transcriptome has been sequenced and a large part of the genome 
assembled [11], and transcriptome profiles are available for P. 
glauca and P. silchensis [44], the potential for SNP based linkage 
maps to identify causative genes for important traits is growing. 
Extreme high-density maps containing gene-based markers for 
every single gene seem possible in the foreseeable future. 

In the present study, a full-sib P. abies family of 247 progenies, 
stemming from a cross between the parents S21K7622162 and 
S21K7621678 [24], was used to construct a consensus linkage 
map based on segregation patterns of 686 SNP markers. The 
markers were derived from an lUumina 3072 SNP Golden Gate 
Assay, resulting in 2058 successful assays and 874 informative 
markers. The success rate from SNP design to successful SNP core 
(67%) was consistent with what has been reported from conifers 



earlier [19,45-47]. The final linkage map is one of the most 
saturated P. abies linkage map to date and the one most enriched 
in mapped genes, although well behind some other conifer maps 
such as the 1 745 genes mapped in Picea glauca [20] or the 2841 in 
Pinus taeda [22]. Quantitative trait loci for four distinct traits of 
resistance against Heterobasidion parviporum infection were 
positioned on the map. This is the first report of resistance QTLs 
in any host to this economically important pathogen, which is a 
vital step towards the identification of the causal genes. 

The twelve linkage groups of the present map correspond to the 
twelve chromosomes of P. abies and most other species of the 
Pinaceae family [48]. The linkage groups are all large, varying 
between 1 10.7 and 186.2 cM and between 32 and 73 unique loci. 
As visible from Table 3, this variation in size among the linkage 
groups corresponds to a relative size of 5.9 to 9.9% of the total 
map length. This mimics the variation in relative size among P. 
abies chromosomes, which span from 6.0 to 10.7% and 5.8 to 
10.8% of the total genome size as per two previous estimations 
based on morphometries of karyotype data [49-50] . Of course, 
this correlation is only circumstantial evidence that the largest 
linkage group reflects the largest chromosome, but still suggests not 
only that the linkage groups indeed represent the actual 
chromosomes, but also that the SNP markers have been derived 
from all parts of the genome. This assumption was strengthened by 
the fact that the map covers 96.5% of the estimated genome 
length, and that every loci is with 97% probability located within 
1 0 cM of a mapped marker. Thus, the map should be saturated 
enough to allow QTL detection for every causative region for the 
respective trait, and even though adding mon; markers to the map 
certainly would make the existing groups denser, it would not 
substantially increase the map size in centimorgans. This 
assumption has been confirmed in a saturated Picea glauca map 
[20], which was increased from 1301 to 2211 markers, while 
decreasing from 2086.8 to 2065 centimorgans. 

There is strong evidence that the marker organisation into 
groups is valid, since ever)' group remained stable up to LOD 10 
and no marker had an average genotype probability (-LoglO(P)) 
higher than 0.3. In total, the twelve groups span 1889.2 cM, with a 
marker density of 2.8 cM/marker. This is fairly similar to the only 
other P. abies map of similar saturation available [17-18], which 
reported a size of 2035 cM and 2.6 cM/marker using 775 
markers. As expected, the 1 74 SNPs sharing a mutual origin with 
another marker always did map very close to their intragenic 
neighbours - the average distance between such markers 
(measured on the most widely spaced two of each gene) was 
0.97 cM. Recombination in P. abies, and plants in general, has 
indeed been suggested to occur mainly within genes [31-33], but 
for enough such events to occur within these gc-ncs to justify this 
average distance, the genes would have to be many times larger 
than the genome average. Instead, this might reflect the 
genotyping error rate for the experiment. Knowing that two of 
markers that were expected to locate very close to the same 
position actually were separated by on average 0.97 cM infers that 
ever)' mark('r position in the map should be considc'red as 
potentially being misplaced by this much. To compensate for this, 
the QTL confidence interval defined by the 1-LOD drop was 
adjusted by 0.97 cM. 
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Figure 2. Linkage groups 5-8 of tKie Picea abies genome and the QTLs for various resistance traits. Names of the SNP markers are 
displayed on the left of the linkage groups. Genetic distance (cM) Is Indicated on the right. SNP markers are described In detail In Table SI. Graphs 
denote QTL effects for the current group. Red curves Indicate fungal growth within sapwood, green lesion length, and purple fungal exclusion. 
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Complete and dashed vertical lines describe 0.1% and 5% levels of significance for the individual trait and group. Wide colored areas between curve 
and group show the QTL confidence interval based on a 1 LOD drop from the QTL peak. The colored marker denotes SNPs under the QTL confidence 
interval, the stars level of significance according to the Kruskal-Wallis test (ranging from p<0.05 (**) to p<0.005 (»»**) and bold style the designated 
cofactors. "D" after a marker name indicates segregation pattern deviating from the expected Mendelian ratios of 1:1 or 1:2:1 (0.005<P<0.05, x^)- 
Image created using MapChart 2.1 (Plant Research International). [28] 
doi:1 0.1 371/journal.pone.01 01 049.g002 



Markers exhibiting segregation distortion can be a sign of 
erroneous genotyping data, especially if the markers are randomly 
dispersed across the map and do not aggregate into clusters [51]. 
On the other hand, clusters of distorted markers may weaken map 
structure [52]. In this study, 4.7% (32) of the 686 positioned SNP 
markers segregated at a ratio distorted from the expected 
Mendelian ratios of 1:1 or 1:2:1 (0.005<P<0.05, This 
proportion is similar to the 6% reported for a P. abies map 
previously [17-18], 1)ut lower than the 12% reported for a P. 
mariana x P. ruhens cross [52] and higher than the 1.9'M) reported 
for P. glauca [18]. As for other conifers, similar distortion levels 
ha\-e been reported for maps oi Pinus species, such as 9"/o for P. 
sylveslris [53], 1-2.4% for various populations oi P. pinaster [54], 
7.4-7.5'!^) for two pedigrees of P. laeda (P<().()05, x^) [55] and 
12% for a P. palustris x P. ellioUii cross [56], while 5.4% of 1364 
markers has been reported as distorted (P<0.01, y^) in Crypto- 
meria japonica [57]. In this study, ten distorted markers formed 
three minor clusters (i.e. had at least one distorted immediate 
neighbour) while the other 22 were scattered across the map (had 
no such neighbour). However, since the distortion level was not 
extreme (0.005<P<0.05, x^) and their presence did not force any 
other markers out of the groups, they were included. 

When distorted markers are clustered, the reason may be 
selection bias. As certain genotypes are lethal or otherwise 
detrimental to viability, there wiU be an inherent selection for 
these areas, so-called viability QTLs. Viability QTLs are of course 
interesting in their own right as they may provide insight into 
genes with a pivotal impact on fitness, but they are not necessarily 
involved in the biology of host resistance. Without selection, the 
segregation pattern in a family would be expected to foUow 
Mendelian ratios all across the genome. At a viability QTL, these 
ratios would be distorted due to the genotype-dependent mortality 
[58]. Also, viability QTLs would be expected to be significant for 
all measurements, as the effect is based on selection prior to 
phenotyping. It is not likely that the QTLs of this study can be 
considered as viability QTLs, because only three of the 27 SNP 
markers found within a QTL confidence intervals in this study 
were distorted (P<0.05, y^), all in different QTLs, on two different 
linkage groups (2, 2 and 6), and none of the QTLs were significant 
for more than one measurement. 

The finalized map was used to locate QTLs for four 
phenotypicaUy distinct aspects of resistance against H. parviporum 
infection using interx^al mapping [38]. Interval mapping assumes 
that the measured traits follow a normal distribution across the 
mapping population. As visible from Figure 5, at least the 
exclusion and infection prevention traits do not foUow a normal 
distribution but rather seem to have a spike at 0 effect (a nuU 
phenotype). Interval mapping can still provide reliable results, if 
the spike at 0 is small and the rest of the data set normally 
distributed and not much larger than 0 [59]. This is probably not 
the case of the exclusion and infection prevention traits. In order 
to verify the QTLs observed for these traits, a Kruskal-Wallis test 
was performed. A Kruskal-WaUis test is a non-parametric test that 
makes no assumption about the probability distributions of the 
traits, and it was performed on each locus separately without using 
any linkage information [37] . The standpoint was that loci deemed 
significant both by interval mapping and the Kruskal-WaUis test 



would be considered as valid QTLs even if the data was not 
normally distributed. This approach was also employed for the 
normally distributed traits for Lesion length and Growth in 
sapwood (Figure 5). 

To avoid scoring false positive QTLs while at the same time 
maintaining enough statistical power to detect as many true 
positive QTLs as possible, the false discovery rate (FDR) was 
estimated at the level of significance q. According to [60] , all traits 
in a multitrait study should be tested simultaneously in a FDR- 
controlling approach. The BH-test [39] was used both for all traits 
together and for each trait individually. For both approaches, the 
FDR was controlled at q<0.05, y^. Thus, if accepting all 13 QTLs 
as true positives, it is simultaneously assumed that on average 5% 
of these would be false positives. 

According to the QTLs, neither trait is controlled by a single 
gene or locus, but significantiy affected by at least two to four 
regions of the genome. No overlap between QTLs for the separate 
traits was identified which suggests that the separate traits measure 
different aspects of host resistance. The percentages of phenotypic 
variation explained (PVE) levels detected for individual QTLs 
(4.6-10.1%), or combined for each trait (12-25.7%), suggest that 
the measured traits are complex in nature and probably controlled 
by a large number of loci. This is expected as defence against a 
necrotroph such as H. parviporum is not based on gene-for-gene 
interactions, but rather on the employment of a large battery of 
genes involved in systemic resistance. Similar PVE levels were 
found for the respective loci associated with bud set and height 
growth in a 283 individual population of P. mariana; 4—11.7% 
and 6.5-12.3% [61]. In a composite P. glauca map, two 
populations of 500 respective 200 individuals were used to detect 
QTLs for bud flush, bud set and height growth, explaining at the 
most 16.4, 22.2 and 10.5% of the respective variation of these 
traits [62], As for other conifers, the PVE of lu-iglit grcnvth, wood 
density and fibre length QTLs in Pseudotsuga menziesii peaked at 
17.7, 14.9 and 15.7% [63] in a population of 320 trees (40 from 
each of 8 fuU-sib families), and resistance to Dothistroma needle 
blight in Pinus radiata, using 202 individuals from 6 fuU-sib 
families,was only explained to 4.8% by the strongest loci [12]. 

In every QTL mapping experiment, there is a danger of 
overestimating the importance of identified QTLs due to the so 
called Beavis effect [64]. According to Beavis, undetected QTLs 
with small effects inflate the estimated impact of closely located, 
detected QTLs. This bias has been reported to increase with LOD 
significance threshold and decrease with population size and map 
saturation [65]. Typically, large PVE-values in low density maps 
and smaU mapping populations run a greater risk of being affected 
by the Beavis effect. Indeed, Pelgas and coworkers [62] compared 
results from two mapping populations of 260 and 500 progenies 
respectively. They found that only 24% of the QTLs found in the 
larger population were also identified in the smaller, and that a 
randomized subset of 250 progenies out of the 500 only identified 
29% of the QTLs. Furthermore, the PVE values of QTLs in the 
smaller population were twice as large as those obtained with the 
larger. This suggests that our population of 247 individuals was not 
large enough to disregard the Beavis effect and high PVE values 
should be regarded with caution as they might be the result of 
overestimation. However, this effect is a lesser factor for QTLs of 
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Figure 3. Linkage groups 9-12 of the P/cea abies genome and the QTLs for various resistance traits. Names of the SNP markers are 
displayed on the left of the linkage groups. Genetic distance (cM) is indicated on the right. SNP markers are described in detail in Table SI. Graphs 
denote QTL effects for the current group. Red curves indicate fungal growth within sapwood, green lesion length, and orange curves indicates 
infection prevention. Complete and dashed vertical lines describe 0.1 % and 5% levels of significance for the individual trait and group. Wide colored 
areas between curve and group show the QTL confidence interval based on a 1 LOD drop from the QTL peak. The colored marker denotes SNPs 
under the QTL confidence interval, the stars level of significance according to the Kruskal-Wallis test {ranging from p<0.05 (**) to p<0.005 (****) and 
bold style the designated cofactors. "D" after a marker name indicates segregation pattern deviating from the expected Mendelian ratios of 1:1 or 
1:2:1 {0.005<P<0.05, x^). Image created using MapChart 2.1 {Plant Research International). [28] 
doi:1 0.1 371 /journal.pone.01 01 049.g003 
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Figure 4. Distribution of mariners. Observed and expected Poisson distribution of the marl<ers frequencies for each 10 cM increment of the map. 
doi:1 0.1 371 /journal.pone.01 01 049.g004 



lower PVE values. Jermstad et al [66-67] found that expanding a 
population from 98 to about 400 only changed the PVE interval 
for identified QTLs from 1.2-1 1.5% to 0.7-9.5%. Since aU QTLs 
identified in this study fall within a similar range (PVE 4.6-10. 1 %), 
it is conceivable that the Beavis effect has played a minor role in 
determining their PVE. 

When screening the infected trees for H. parviporum conidi- 
ophores, Arnerup et al. [24] concluded that 27% of the ramets 
contained no fungi. The reasons for this could be either technical, 
for example poorly infested wood blocks, or biological, i.e. some 
inherent trait that allows the host to prevent the establishment of 
the fungus in the wound. By mapping the proportion of immune 
ramets of each progeny as a trait, we discovered that 18.4% of this 
variation was explained by three QTLs on LG 1 , 2 and 1 1 
(Table 1). This result suggests that infection prevention at a very 
early stage is a heritable trait. As such a trait could potentially 
confer complete resistance to infection, rather than just limit the 
fungal spread within the tree, these QTLs might be of special 
interest for future P. abies breeding programmes. 

The size of the P. abies genome, the low density of coding regions 
per MB and the difficulties with assembling genomes rich in 
transposable elements will complicate the investigation of the whole 
sequence corresponding to each QTL. Li fact, as each assembled 
scaffold contains only one gene or less on average, we can at present 
only identify candidate genes from the QTLs if the map carries a 
SNP marker from them. The present linkage map contains 686 
gene-based markers, a mere 2.4%, of the 28,354 genes of the P. 
abies genome [11]. However, some of the SNPs within the QTLs do 
originate from genes with known functions in plant response to 
wounding or microbial infection and have previously been shown to 



be regulated during H. parviporum infection or belong to the same 
biosynthethic pathways. This warrants discussing them, despite the 
low proportion of represented genes. 

Lignin plays an important part in both chemical and 
mechanical resistance against pathogens on P. abies [68] . Several 
of the SNPs found within the QTL confidence intervals are known 
intermediates in the lignification process. The QTL on LG 1 1 
contained, at 114.3 cM, a SNP originating from the transcription 
factor R2R3-MYB1 1. The R2R3-MYB transcription factor famUy 
plays a role in ligiiin metabolism [42], and transcripts of tiie 
MYBll gene were shown to increase slightly but significantiy in 
the lignin enriched compressed wood of P. glauca compared to 
opposite wood [69]. One of the other QTLs for infection 
prevention, at 158.4 cM on LG 1, contained a SNP from a gene 
for hydroxycinnamoyl CoA shikimate/quinate hydroxycinna- 
moyltransferase (HOT). HCT has been shown to be involved in 
lignin biosynthesis in several plants, including Pinus radiata [40] 
and Populus euramericana [70]. Furthermore, a SNP from a gene 
for 4-coumarate CoA-ligase (4CL) was located at 76.0 cM on LG 
3, in a QTL for the fungal exclusion trait. This enzyme is involved 
in the conjugation of hydroxycinnamates with CoA, and thus also 
related to the biosynthesis of phenyl propanoid [41]. 

Another interesting observation was made in the QTL for 
fungal growth within sapwood on LG 6. A SNP from a gene for 
leucoanthocyaiiidin reductase (accession number BT 10950 in 
Table 1, LARl in [23]) was mapped to 182.6 cM. LARl is an 
important intermediate in the flavonoid pathway, converting 
leucoanthocyanidins to catechins, which subsequently are con- 
verted into the defence-related phenolics flavonoid-derived 
proanthocyanidins [71]. Both catechin and LARl mRNA-levels, 
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Figure 5. Distribution of resistance data. The distribution of observed resistance, measured as lesion length, fungal growth within sapwood, 
fungal exclusion, and ability to prevent infection throughout the mapping population, using all four ramets. 
doi:1 0.1 371 /journal.pone.01 01 049.g005 



as well as other intermediates in the flavonoid pathway, have 
previously been shown to significantly increase in P. abies clones in 
response to Heterobasidion. The increase was greater in less 
susceptible clones than in more susceptible [24,72]. Still, this 
pathway is not unique for reactions against Heterobasidion attack 
but it is rather involved in general host response to many kinds of 
stress. For example, the very same LAR gene identified in this 
study is also induced in P. abies in response to Ceratocystis polonica 
inoculation and wounding {LAR3 in [71]). Taken together, this 
strengthens the notion that the BT 10950 locus is important for P. 
abies resistance to H. parviporum infections. 

In this study, we used 247 fuU-sib progenies and 686 SNP 
markers derived from transcribed genes to construct one of the 
most saturated, and the most gene-enriched, P. abies maps to date. 
It constitutes a powerful tool for future mapping of important traits 
and understanding of recombination events in conifers. We used 
the map to identify QTL regions for four distinct resistance traits 
involved in host response to H. parviporum infection, explaining 
in total 12-25.7% of the variation for each trait. This showed that 
all traits are complex and controlled by several key genes on 
different chromosomes, as would be expected when measuring 
systemic resistance. When analysing the SNP markers within the 
QTLs, we identified four genes previously reported to play a part 
in plant defence against fungal attack, of which one has previously 
been shown to be regulated during infection and three play a role 
in the phenyl propanoid pathway. Albeit the significant SNPs 
likely represent only a few of the genes present in the sequence 



covered by the QTLs, they can be used as valuable starting points 
for further investigations. 
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